Sorting from Noisier Samples

Authors

  • Aviad Rubinstein
  • Shai Vardi
Abstract

We study the problem of constructing an order over a set of elements given noisy samples. We consider two models for generating the noisy samples; in both, the distribution of samples is induced by an unknown state of nature: a permutation $\rho$. In the Mallows model, $r$ permutations $\pi_i$ are generated independently from $\rho$, each with probability proportional to $e^{-\beta d_K(\rho, \pi_i)}$, where $d_K(\rho, \pi_i)$ is the Kemeny distance between $\rho$ and $\pi_i$, that is, the number of pairs they order differently. In the noisy comparisons model, we are given a tournament generated from $\rho$ as follows: if $i$ is before $j$ in $\rho$, then with probability $1/2 + \gamma$ the edge between them is oriented from $i$ to $j$ (and from $j$ to $i$ otherwise). Both of these problems were studied by Braverman and Mossel [7], who showed how to construct a maximum-likelihood permutation when the noise parameter ($\beta$ or $\gamma$, respectively) is constant. In this work, we obtain algorithms that work in the presence of stronger noise ($\beta r = \tilde{\Omega}(1/\log^{2} n)$ or $\gamma = \tilde{\Omega}(1/\log^{1/6} n)$, respectively). In the Mallows model, our algorithm works for a relaxed solution concept: likelier than nature. That is, rather than requiring that our output maximize the likelihood over the entire domain, we guarantee that the likelihood of our output is, with high probability, greater than or equal to that of the true state of nature ($\rho$). An interesting feature of our algorithm is that it handles noise by adding more noise.
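To make the two sampling models in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`kemeny_distance`, `mallows_weight`, `noisy_tournament`) are hypothetical, permutations are represented as lists of distinct elements, and the Mallows probability is left unnormalized because the abstract only specifies it up to proportionality.

```python
import math
import random
from itertools import combinations


def kemeny_distance(rho, pi):
    """Count the pairs of elements that rho and pi order differently."""
    pos_rho = {x: i for i, x in enumerate(rho)}
    pos_pi = {x: i for i, x in enumerate(pi)}
    return sum(
        1
        for a, b in combinations(rho, 2)
        if (pos_rho[a] < pos_rho[b]) != (pos_pi[a] < pos_pi[b])
    )


def mallows_weight(rho, pi, beta):
    """Unnormalized Mallows weight of pi given rho: exp(-beta * d_K(rho, pi))."""
    return math.exp(-beta * kemeny_distance(rho, pi))


def noisy_tournament(rho, gamma, rng):
    """Sample a tournament: each pair agrees with rho with probability 1/2 + gamma."""
    edges = set()
    for a, b in combinations(rho, 2):  # a comes before b in rho
        if rng.random() < 0.5 + gamma:
            edges.add((a, b))  # edge oriented consistently with rho
        else:
            edges.add((b, a))  # edge flipped by noise
    return edges


rho = [0, 1, 2, 3]
pi = [1, 0, 2, 3]
print(kemeny_distance(rho, pi))  # 1: only the pair (0, 1) is ordered differently
print(mallows_weight(rho, pi, beta=0.5))
print(sorted(noisy_tournament(rho, gamma=0.1, rng=random.Random(0))))
```

Smaller values of `beta` or `gamma` correspond to stronger noise: as `beta` approaches 0 every permutation gets nearly the same weight, and as `gamma` approaches 0 each edge orientation approaches a fair coin flip.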




Publication date: 2017